Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance
Abstract
MOTIVATION: Metagenomic sequencing of clinical samples is a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can resolve the serotypes of a pathogen. Sigma was developed to identify and quantify pathogens at the strain level by comparing metagenomic reads against their reference genomes.
RESULTS: Sigma provides accurate strain-level inferences along with three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, including hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm's performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains.
AVAILABILITY AND IMPLEMENTATION: Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
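To make the ideas of relative-abundance estimation and read assignment concrete, the following is a minimal Python sketch, not Sigma's actual implementation (Sigma is written in C++, and the abstract does not specify its statistical model). It assumes a simple mixture model in which each read comes from one of several reference genomes, and it uses a standard expectation-maximization (EM) loop to estimate the mixture weights. The function name `estimate_abundances`, the likelihood matrix, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_abundances(read_likelihoods, n_iter=200, tol=1e-10):
    """Illustrative EM for mixture weights (an assumption, not Sigma's code).

    read_likelihoods[i, j] is a hypothetical P(read i | genome j).
    Every read is assumed to have a positive likelihood for at least
    one genome, so each row can be normalized.
    Returns (weights, assignments): estimated relative abundances and
    the index of the most likely genome for each read.
    """
    lik = np.asarray(read_likelihoods, dtype=float)
    n_genomes = lik.shape[1]
    weights = np.full(n_genomes, 1.0 / n_genomes)  # uniform starting point

    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each genome
        joint = lik * weights
        posterior = joint / joint.sum(axis=1, keepdims=True)
        # M-step: new weights are the average posterior over all reads
        new_weights = posterior.mean(axis=0)
        converged = np.abs(new_weights - weights).max() < tol
        weights = new_weights
        if converged:
            break

    # Assign each read to the genome with the highest posterior probability
    joint = lik * weights
    posterior = joint / joint.sum(axis=1, keepdims=True)
    return weights, posterior.argmax(axis=1)


# Toy input: 3 reads match only genome 0, 1 read matches only genome 1,
# and 2 reads match both genomes equally well.
toy_likelihoods = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.5, 0.5],
])

weights, assignments = estimate_abundances(toy_likelihoods)
print(weights, assignments)
```

In this toy case the uniquely mapped reads fix the ratio at 3:1, so the estimated weights approach 0.75 and 0.25, and the two ambiguous reads are assigned to the more abundant genome. Confidence intervals and hypothesis tests, which the abstract lists among Sigma's capabilities, are outside the scope of this sketch.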
Similar resources
Deconvoluting simulated metagenomes: the performance of hard- and soft- clustering algorithms applied to metagenomic chromosome conformation capture (3C)
BACKGROUND Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (str...
metaSNV: A tool for metagenomic strain level analysis
We present metaSNV, a tool for single nucleotide variant (SNV) analysis in metagenomic samples, capable of comparing populations of thousands of bacterial and archaeal species. The tool uses as input nucleotide sequence alignments to reference genomes in standard SAM/BAM format, performs SNV calling for individual samples and across the whole data set, and generates various statistics for indiv...
Expression of heterologous sigma factors enables functional screening of metagenomic and heterologous genomic libraries
A key limitation in using heterologous genomic or metagenomic libraries in functional genomics and genome engineering is the low expression of heterologous genes in screening hosts, such as Escherichia coli. To overcome this limitation, here we generate E. coli strains capable of recognizing heterologous promoters by expressing heterologous sigma factors. Among seven sigma factors tested, RpoD ...
Metagenomic Profiling of Known and Unknown Microbes with MicrobeGPS
Microbial community profiling identifies and quantifies organisms in metagenomic sequencing data using either reference based or unsupervised approaches. However, current reference based profiling methods only report the presence and abundance of single reference genomes that are available in databases. Since only a small fraction of environmental genomes is represented in genomic databases, th...
A Metagenomic Analysis of Lung Microbiome in Chemically Injured and Healthy Individuals
Background and Aim: The role of the lung microbiome in respiratory complications associated with chemicals such as sulfur mustard or chlorine gas has yet to be determined. The aim of this study was to compare the structure and composition of the lung microbiome in chemically injured and healthy individuals in order to understand the relation between the population of the lung microbiota and res...